synthetic language
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Austria > Vienna (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
M-CIF: Multi-Scale Alignment For CIF-Based Non-Autoregressive ASR
Mao, Ruixiang, Ma, Xiangnan, Yang, Qing, Zhu, Ziming, Qiao, Yucheng, Ge, Yuan, Xiao, Tong, Gao, Shengxiang, Yu, Zhengtao, Zhu, Jingbo
The Continuous Integrate-and-Fire (CIF) mechanism provides effective alignment for non-autoregressive (NAR) speech recognition: it creates a smooth, monotonic mapping from acoustic features to target tokens and achieves performance on Mandarin competitive with other NAR approaches. However, without finer-grained guidance, its stability degrades in languages such as English and French. In this paper, we propose Multi-scale CIF (M-CIF), which performs multi-level alignment by progressively distilling character- and phoneme-level supervision into subword representations, thereby making acoustic-text alignment more robust. Experiments show that M-CIF reduces WER compared to the Paraformer baseline, most notably on CommonVoice, by 4.21% in German and 3.05% in French. To further investigate these gains, we define phonetic confusion errors (PE) and space-related segmentation errors (SE) as evaluation metrics. Analysis of these metrics across different M-CIF settings reveals that the phoneme and character layers are essential for enhancing progressive CIF alignment.
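The core CIF firing rule described above can be sketched in a few lines: per-frame weights are accumulated, and an integrated vector is emitted each time the running sum crosses a threshold. This is a minimal illustration, not the paper's implementation; in practice the weights come from a learned scalar head and the tail/scaling is handled during training.

```python
def cif_fire(frames, alphas, threshold=1.0):
    """Accumulate per-frame weights `alphas`; emit a weighted sum of
    frames each time the running total crosses `threshold`.
    Any trailing partial accumulation is dropped in this sketch."""
    out = []
    acc = 0.0                          # accumulated weight
    integ = [0.0] * len(frames[0])     # integrated feature vector
    for frame, a in zip(frames, alphas):
        if acc + a < threshold:
            acc += a
            integ = [v + a * f for v, f in zip(integ, frame)]
        else:
            used = threshold - acc     # portion of this frame closing the token
            out.append([v + used * f for v, f in zip(integ, frame)])
            rem = a - used             # remainder starts the next token
            acc = rem
            integ = [rem * f for f in frame]
    return out
```

With uniform weights of 0.5 over four frames, exactly two token vectors fire, each integrating one unit of weight, which is the smooth, monotonic feature-to-token mapping the abstract refers to.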
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
Searching for the Most Human-like Emergent Language
Boldt, Brendon, Mortensen, David
In this paper, we design a signalling game-based emergent communication environment to generate state-of-the-art emergent languages in terms of similarity to human language. This is done with hyperparameter optimization, using XferBench as the objective function. XferBench quantifies the statistical similarity of emergent language to human language by measuring its suitability for deep transfer learning to human language. Additionally, we demonstrate the predictive power of entropy on the transfer learning performance of emergent language as well as corroborate previous results on the entropy-minimization properties of emergent communication systems. Finally, we report generalizations regarding what hyperparameters produce more realistic emergent languages, that is, ones which transfer better to human language.
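The signalling-game setup underlying this environment can be sketched with tabular agents and simple reinforcement: a sender maps a state to a message, a receiver maps the message to a guess, and both raise the propensity of the entries they used when the guess is correct. This is an illustrative toy, not the paper's environment or learning rule.

```python
import random

def lewis_game(num_states=4, num_msgs=4, episodes=3000, lr=0.1, seed=0):
    """Tabular Lewis signalling game trained by simple reinforcement.
    Returns the greedy communication accuracy after training."""
    rng = random.Random(seed)
    S = [[1.0] * num_msgs for _ in range(num_states)]    # sender propensities
    R = [[1.0] * num_states for _ in range(num_msgs)]    # receiver propensities

    def sample(weights):
        return rng.choices(range(len(weights)), weights=weights)[0]

    for _ in range(episodes):
        s = rng.randrange(num_states)   # nature draws a state
        m = sample(S[s])                # sender emits a message
        g = sample(R[m])                # receiver guesses the state
        if g == s:                      # on success, reinforce the used entries
            S[s][m] += lr
            R[m][g] += lr

    # greedy evaluation: argmax message, then argmax guess
    correct = 0
    for s in range(num_states):
        m = max(range(num_msgs), key=lambda j: S[s][j])
        g = max(range(num_states), key=lambda j: R[m][j])
        correct += (g == s)
    return correct / num_states
```

Depending on the seed, such dynamics can settle into pooling equilibria rather than a perfect signalling system, which is part of why hyperparameter search over the environment matters.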
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Transfer of Structural Knowledge from Synthetic Languages
Budnikov, Mikhail, Yamshchikov, Ivan
This work explores transfer learning from several synthetic languages to English. We investigate the structure of the embeddings in the fine-tuned models, the information they contain, and the capabilities of the fine-tuned models on simple linguistic tasks. We also introduce a new synthetic language that leads to better transfer to English than the languages used in previous research. Finally, we introduce the Tiny-Cloze Benchmark, a new synthetic benchmark for natural language understanding that is more informative for less powerful models. We use the Tiny-Cloze Benchmark to evaluate fine-tuned models in several domains, demonstrating that fine-tuning on a new synthetic language allows for better performance on a variety of tasks.
- Europe > Germany > Bremen > Bremen (0.04)
- Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
- Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Morphological Typology in BPE Subword Productivity and Language Modeling
This study investigates the impact of morphological typology on tokenization and language modeling performance. We focus on languages with synthetic and analytic morphological structures and examine their subword productivity when tokenized with the byte-pair encoding (BPE) algorithm. We compare the performance of models trained with similar amounts of data in different languages. Our experiments reveal that languages with synthetic features exhibit greater subword regularity and productivity under BPE tokenization and achieve better results in language modeling tasks. We also observe that the typological continuum from linguistic theory is reflected in several experiments. These findings suggest a correlation between morphological typology and BPE tokenization efficiency.
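The BPE procedure the study relies on is straightforward to sketch: represent each word as a sequence of symbols plus an end-of-word marker, then repeatedly merge the most frequent adjacent symbol pair. This is the standard algorithm in a minimal form, not the study's tokenizer; the toy corpus below is illustrative.

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn BPE merges from a list of words.
    Each step merges the most frequent adjacent symbol pair."""
    vocab = Counter(tuple(word) + ('</w>',) for word in corpus)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])   # apply the merge
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges
```

On a corpus dominated by the shared stem "low" ("low", "lower", "lowest"), the first merges assemble that stem, which is the kind of subword regularity the study measures across morphological types.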
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Saxony > Leipzig (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
The Model Is The Message
An odd controversy appeared in the news cycle last month when a Google engineer, Blake Lemoine, was placed on leave after publicly releasing transcripts of conversations with LaMDA, a chatbot based on a Large Language Model (LLM) that he claims is conscious, sentient and a person. Like most other observers, we do not conclude that LaMDA is conscious in the ways that Lemoine believes it to be. His inference is clearly based on motivated anthropomorphic projection. At the same time, it is also possible that these kinds of artificial intelligence (AI) are "intelligent" -- and even "conscious" in some way -- depending on how those terms are defined. Still, neither of these terms can be very useful if they are defined in strongly anthropocentric ways. An AI may also be one and not the other, and it may be useful to distinguish sentience from both intelligence and consciousness. For example, an AI may be genuinely intelligent in some way but only sentient in the restrictive sense of sensing and acting deliberately on external information. Perhaps the real lesson for philosophy of AI is that reality has outpaced the available language to parse what is already at hand. A more precise vocabulary is essential.
- Health & Medicine (1.00)
- Information Technology > Services (0.34)
Grounding Spatio-Temporal Language with Transformers
Karch, Tristan, Teodorescu, Laetitia, Hofmann, Katja, Moulin-Frier, Clément, Oudeyer, Pierre-Yves
Language is an interface to the outside world. In order for embodied agents to use it, language must be grounded in other, sensorimotor modalities. While there is an extended literature studying how machines can learn grounded language, the topic of how to learn spatio-temporal linguistic concepts is still largely uncharted. To make progress in this direction, we introduce a novel spatio-temporal language grounding task where the goal is to learn the meaning of spatio-temporal descriptions of behavioral traces of an embodied agent. This is achieved by training a truth function that predicts whether a description matches a given history of observations. The descriptions involve time-extended predicates in past and present tense as well as spatio-temporal references to objects in the scene. To study the role of architectural biases in this task, we train several models including multimodal Transformer architectures; the latter implement different attention computations between words and objects across space and time. We test models on two classes of generalization: 1) generalization to randomly held-out sentences; 2) generalization to grammar primitives. We observe that maintaining object identity in the attention computation of our Transformers is instrumental to achieving good performance on generalization overall, and that summarizing object traces in a single token has little influence on performance. We then discuss how this opens new perspectives for language-guided autonomous embodied agents. We also release our code under an open-source license, as well as pretrained models and datasets, to encourage the wider community to build upon and extend our work in the future.
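The truth-function task described above pairs a description with a trace of observations and asks for a match/no-match label. A hand-coded evaluator over a toy grammar shows the shape of such labels; the object names, relation, and tuple format here are all illustrative, not the paper's dataset.

```python
def eval_description(trace, description):
    """Evaluate a toy spatio-temporal description over a trace.
    `trace` is a list of snapshots mapping object name -> (x, y).
    `description` is (subject, tense, relation, object), e.g.
    ('ball', 'was', 'left_of', 'box'). 'is' checks the last snapshot
    (present tense); 'was' checks any earlier snapshot (past tense)."""
    subj, tense, rel, obj = description

    def holds(snap):
        if rel == 'left_of':
            return snap[subj][0] < snap[obj][0]
        return snap[subj][0] > snap[obj][0]   # 'right_of'

    if tense == 'is':
        return holds(trace[-1])
    return any(holds(s) for s in trace[:-1])
```

A learned truth function replaces this hand-coded check with a model that consumes the word sequence and the observation history, which is where the different word-object attention schemes compared in the paper come in.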
Researchers say we need better benchmarks to build more useful AI assistants
The promise of conversational AI is that, unlike virtually any other form of technology, all you have to do is talk. Natural language is the most natural and democratic form of communication. After all, humans are born capable of learning how to speak, but some never learn to read or use a graphical user interface. That's why AI researchers from Element AI, Stanford University, and CIFAR recommend that academic researchers take steps to create more useful forms of AI that speak with people to get things done, including the elimination of existing benchmarks. "As many current [language user interface] benchmarks suffer from low ecological validity, we recommend researchers not to initiate incremental research projects on them. Benchmark-specific advances are less meaningful when it is unclear if they transfer to real LUI use cases. Instead, we suggest the community to focus on conceptual research ideas that can generalize well beyond the current datasets," the paper reads.